Increasingly detailed insights in animal behaviours using continuous on-board processing of accelerometer data

Background: Studies of animal behaviour, ecology and physiology continue to benefit from advances in biologging techniques, including the collection of accelerometer data to infer animal behaviours and energy expenditure. One of the most recent technological advances in this space, on-board processing of raw accelerometer data into animal behaviours, has proven highly energy-, weight- and cost-efficient, allowing continuous behavioural data to be collected in addition to regular positional data in a wide range of animal tracking studies.

Methods: We implemented this latest development to collect continuous behaviour records from six Pacific Black Ducks (Anas superciliosa) and to evaluate some of this novel technique's potential advantages over tracking studies that lack behavioural data or record accelerometer data only intermittently. We (i) compared the discrepancy in time-activity budgets between continuous records and behaviours sampled at different intervals, (ii) compared total daily distance flown estimated from hourly GPS fixes with and without additional behavioural data and (iii) explored how behaviour records can provide additional insights for animal home range studies.

Results: Using a total of 690 days of behaviour records across six individual ducks, distinguishing eight different behaviours, we illustrated the improvement in time-activity budget accuracy that is obtained if continuous rather than interval-sampled accelerometer data are used. Notably, for rare behaviours such as flying and running, error ratios > 1 were common when sampling intervals exceeded 10 min. Using 72 days of hourly GPS fixes in combination with continuous behaviour records over the same period in one individual duck, we showed that the behaviour-based daily distance estimate is significantly higher (up to 540%) than the distance calculated from hourly sampled GPS fixes. With the same 72 days of data for this individual, we also showed how it used specific sites within its entire home range to satisfy specific needs (e.g. roosting and foraging).

Conclusion: We showed that trackers allowing for continuous recording of animal behaviour can substantially improve the estimation of time-activity budgets and daily travelling distances. By integrating behaviour into home-range estimation, we also highlight that this novel tracking technique may not only improve estimates but also open new avenues in animal behaviour research, importantly improving our knowledge of an animal's state while it is roaming the landscape.

Supplementary Information: The online version contains supplementary material available at 10.1186/s40462-022-00341-6.

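To formulate it differently, the data for animal $h$ at interval length $i$ constitute a realization of a random vector $(X_{h,i,1}, \dots, X_{h,i,k})$ with distribution $\mathrm{Mult}(n_{h,i}, p_{h,i,1}, \dots, p_{h,i,k})$. Note that $X_{h,i,j}$ has a binomial distribution with parameters $n_{h,i}$ and $p_{h,i,j}$, and hence that the coefficient of variation for the $j$th category equals

$$\frac{\sqrt{n_{h,i}\, p_{h,i,j}\,(1 - p_{h,i,j})}}{n_{h,i}\, p_{h,i,j}} = \sqrt{\frac{1 - p_{h,i,j}}{n_{h,i}\, p_{h,i,j}}}. \tag{1.1}$$

The influence of the interval length

We are interested in the size of the standard deviation in estimating $p_{h,i,j}$ relative to the value of $p_{h,i,j}$, in particular in how this error ratio depends on $i$. Now $p_{h,i,j}$ is estimated by $X_{h,i,j}/n_{h,i}$, and according to Section 1 this estimator has standard deviation

$$\sqrt{\frac{p_{h,i,j}\,(1 - p_{h,i,j})}{n_{h,i}}}.$$

Consequently, writing $n_{h,i} = c_h / i$ with $c_h$ not depending on $i$, the error ratio equals (cf. (1.1))

$$r_{h,i,j} = \sqrt{\frac{1 - p_{h,i,j}}{n_{h,i}\, p_{h,i,j}}} = \sqrt{\frac{i\,(1 - p_{h,i,j})}{c_h\, p_{h,i,j}}}, \tag{2.4}$$

and hence

$$\ln r_{h,i,j} = -\tfrac{1}{2} \ln c_h + \tfrac{1}{2} \ln i - \tfrac{1}{2} \ln p_{h,i,j} + \tfrac{1}{2} \ln(1 - p_{h,i,j}). \tag{2.5}$$

This means that the error ratio is a square root function of the interval length, as visible in Figure 3; note also that the regression mentioned in the caption of Figure 3 basically resembles (2.5). The error ratio can be estimated using (cf. (2.4))

$$\hat{r}_{h,i,j} = \sqrt{\frac{1 - X_{h,i,j}/n_{h,i}}{X_{h,i,j}}} = \sqrt{\frac{n_{h,i} - X_{h,i,j}}{n_{h,i}\, X_{h,i,j}}}.$$

Note that this estimator might take the value infinity, since $X_{h,i,j} = 0$ has positive probability.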
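The square-root dependence on the interval length can be checked numerically. The following is a minimal sketch, assuming the error ratio is the binomial standard deviation of $X/n$ divided by $p$, i.e. $\sqrt{(1-p)/(np)}$, and that the number of observations is $n = c/i$. The function names and the example values of `c` and `p` are hypothetical and are not taken from the study.

```python
import math


def error_ratio(p, n):
    """Standard deviation of X/n divided by p, for X ~ Binomial(n, p)."""
    return math.sqrt((1.0 - p) / (n * p))


def log_error_ratio(p, c, interval):
    """Right-hand side of (2.5), assuming n = c / interval."""
    return (-0.5 * math.log(c) + 0.5 * math.log(interval)
            - 0.5 * math.log(p) + 0.5 * math.log(1.0 - p))


def estimated_error_ratio(x, n):
    """Plug-in estimate with p replaced by x / n; infinite when x == 0."""
    if x == 0:
        return math.inf
    return math.sqrt((n - x) / (n * x))


if __name__ == "__main__":
    c = 1440   # hypothetical: one observation per minute for one day
    p = 0.01   # hypothetical proportion of time spent in a rare behaviour
    for interval in (1, 10, 60):
        n = c / interval
        print(interval, round(error_ratio(p, n), 3))
```

Under these assumptions, quadrupling the interval length doubles the error ratio, which is the square-root behaviour described by (2.5).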

Markov Chains
Consider observations of animal $h$ taken at interval length $i$. A natural model for the behaviour of animals is that the present behaviour has some influence on the behaviour in the immediate future. More precisely, the probability $p_{g,j}$ that the behaviour at time $t + i$ is $j$, given that the behaviour at time $t$ is $g$, might deviate from $p_j$. This leads to a Markov chain as a model for the consecutive observations. In our case we have $k$ behaviours (or states in the Markov Chain terminology). The probabilities $p_{g,j}$, $g = 1, \dots, k$, $j = 1, \dots, k$, can be arranged in a $k \times k$ matrix (the transition matrix) in which the $g$th row contains the probabilities $p_{g,j}$, $j = 1, \dots, k$. In case of independence between consecutive points in time, all rows in the matrix will be the same and equal to $(p_1, \dots, p_k)$. Note that the continuous observations yield a very accurate estimate of $(p_1, \dots, p_k)$.
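As an illustration of how the transition matrix could be estimated from a sequence of consecutive observations, here is a minimal sketch. It assumes behaviours are coded as integers 0 to k-1; the function names and the toy sequence are hypothetical.

```python
def transition_counts(states, k):
    """Count transitions g -> j between consecutive observations."""
    counts = [[0] * k for _ in range(k)]
    for g, j in zip(states, states[1:]):
        counts[g][j] += 1
    return counts


def transition_matrix(counts):
    """Row-normalise the counts; rows without observations become NaN."""
    matrix = []
    for row in counts:
        n_g = sum(row)
        matrix.append([x / n_g if n_g else float("nan") for x in row])
    return matrix


if __name__ == "__main__":
    # Hypothetical sequence with k = 2 behaviours.
    sequence = [0, 0, 1, 0, 1, 1, 0, 0]
    counts = transition_counts(sequence, 2)
    print(counts)
    print(transition_matrix(counts))
```

Each row of the resulting matrix is the empirical distribution of the next behaviour given the current one. Under independence, all rows would be close to the same vector.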
To test if the $g$th row of the transition matrix equals $(p_1, \dots, p_k)$ one often applies the Pearson $\chi^2$ test; see e.g. https://en.wikipedia.org/wiki/Chi-squared_test. Let $n_g$ be the number of observations (of animal $h$ at interval length $i$) at which the animal shows behaviour $g$ (or is in state $g$). Let $X_{g,j}$ be the number of times the behaviour changes from $g$ to $j$; so, $\sum_{j=1}^{k} X_{g,j} = n_g$. Now the Pearson $\chi^2$ test statistic equals

$$T_g = \sum_{j=1}^{k} \frac{n_g\,(X_{g,j}/n_g - p_j)^2}{p_j} = \sum_{j=1}^{k} \frac{(X_{g,j} - n_g p_j)^2}{n_g p_j}. \tag{3.7}$$

Under the null hypothesis of independence, i.e., the hypothesis that all rows of the transition matrix are the same, the test statistic $T_g$ has approximately a $\chi^2$ distribution with $k - 1$ degrees of freedom. The fact that $n_g$ is a random variable does not matter here: given $n_g$, the random variable $X_{g,j}$ has a binomial distribution with parameters $n_g$ and $p_j$.
Since we want to test if all rows of the transition matrix are the same, it makes sense to consider the test statistic

$$T = \sum_{g=1}^{k} T_g,$$

which has approximately a $\chi^2$ distribution with $k(k - 1)$ degrees of freedom under the null hypothesis for reasonably large values of $n_g$.
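The row-wise and overall Pearson statistics can be computed directly from a table of transition counts. The sketch below assumes the overall statistic is the sum of the row statistics over all $k$ rows, matching the stated $k(k-1)$ degrees of freedom. The function names and the example counts are hypothetical.

```python
def pearson_row_statistic(row_counts, p):
    """Pearson chi-squared statistic T_g for one row of transition counts."""
    n_g = sum(row_counts)
    if n_g == 0:
        raise ValueError("row has no observations")
    return sum((x - n_g * p_j) ** 2 / (n_g * p_j)
               for x, p_j in zip(row_counts, p))


def pearson_total_statistic(counts, p):
    """Sum of the row statistics and the matching degrees of freedom."""
    k = len(p)
    total = sum(pearson_row_statistic(row, p) for row in counts)
    return total, k * (k - 1)


if __name__ == "__main__":
    # Hypothetical counts for k = 2 behaviours with p = (0.5, 0.5).
    counts = [[30, 10], [10, 30]]
    p = [0.5, 0.5]
    print(pearson_row_statistic(counts[0], p))
    print(pearson_total_statistic(counts, p))
```

In this toy example each row statistic is 10 and the overall statistic is 20 with 2 degrees of freedom, far above the 5% critical value of about 5.99, so independence would be rejected.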
The true $g$th row of the transition matrix is something we don't know, but it is estimated by $(X_{g,1}/n_g, \dots, X_{g,k}/n_g) = (q_1, \dots, q_k)$. In case of independence this $k$-vector should be close to the $k$-vector $(p_1, \dots, p_k)$. Now, it seems natural to look for a suitable distance measure to determine the distance between these two $k$-vectors, $p = (p_1, \dots, p_k)$ and $q = (q_1, \dots, q_k)$. Note that $p$ and $q$ are elements of the so-called $(k - 1)$-simplex, which is the collection of $k$-vectors with nonnegative components that add up to 1.
A simple distance is the $L_1$-distance, which goes by many other names; see https://en.wikipedia.org/wiki/Taxicab_geometry. It is defined by

$$d_1(p, q) = \sum_{j=1}^{k} |p_j - q_j|, \tag{4.9}$$

whereas the standard Euclidean or $L_2$-distance is defined by

$$d_2(p, q) = \sqrt{\sum_{j=1}^{k} (p_j - q_j)^2}. \tag{4.10}$$

Let $p_{\min}$ be the minimum value among $p_1, \dots, p_k$. Then one can show

$$\max_{q \in \Delta_{k-1}} d_1(p, q) = 2\,(1 - p_{\min}). \tag{4.11}$$

Consequently it seems natural to choose as a dependence measure (for each row $g$ and each $i$)

$$\Delta_1(p, q) = \frac{d_1(p, q)}{2\,(1 - p_{\min})}, \tag{4.12}$$

since the value of $\Delta_1(p, q)$ is between 0 and 1. Here 0 corresponds to independence in the Markov Chain when starting from state/behaviour $g$, and 1 corresponds to the strongest possible dependence. An important question is which values of $\Delta_1(p, q)$ are close enough to 0 to make results based on monitoring with interval length $i$ sufficiently reliable.
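The distances and the dependence measure are straightforward to compute. The sketch below assumes the dependence measure is the $L_1$-distance divided by its maximum over the simplex, $2(1 - p_{\min})$, which is what keeps its value between 0 and 1. The function names and the example vectors are hypothetical.

```python
import math


def l1_distance(p, q):
    """Sum of absolute componentwise differences."""
    return sum(abs(p_j - q_j) for p_j, q_j in zip(p, q))


def l2_distance(p, q):
    """Standard Euclidean distance."""
    return math.sqrt(sum((p_j - q_j) ** 2 for p_j, q_j in zip(p, q)))


def dependence_measure(p, q):
    """L1 distance scaled by its assumed maximum 2 * (1 - min(p))."""
    return l1_distance(p, q) / (2.0 * (1.0 - min(p)))


if __name__ == "__main__":
    p = [0.5, 0.3, 0.2]                           # hypothetical overall proportions
    print(dependence_measure(p, p))               # row identical to p
    print(dependence_measure(p, [0.0, 0.0, 1.0])) # row concentrated on the rarest state
    print(dependence_measure(p, [0.6, 0.3, 0.1])) # row close to p
```

A row equal to $p$ gives 0, and a row that puts all mass on the rarest behaviour gives the maximum value, so intermediate values can be read as a fraction of the largest possible deviation.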
Proof of (4.11). Since the simplex $\Delta_{k-1}$ is a convex and compact set and since $q \mapsto d_1(p, q)$ is a continuous convex function, this function attains its maximum at an extreme point of the simplex, according to Bauer's maximum principle. These extreme points are the unit vectors. This implies (with $e_j$ the $j$th unit vector)

$$\max_{q \in \Delta_{k-1}} d_1(p, q) = \max_{1 \le j \le k} d_1(p, e_j) = 2\,(1 - p_{\min}). \tag{4.13}$$

$\square$
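The evaluation of the $L_1$-distance at the unit vectors can be written out explicitly. The step below is a sketch filled in for the reader, assuming $d_1$ is the sum of absolute componentwise differences. The summation index $l$ is introduced here for illustration, and the computation uses only that the components of $p$ are nonnegative and sum to 1.

```latex
\begin{aligned}
d_1(p, e_j) &= |p_j - 1| + \sum_{l \neq j} |p_l - 0| \\
            &= (1 - p_j) + (1 - p_j) \\
            &= 2\,(1 - p_j),
\end{aligned}
\qquad\text{so}\qquad
\max_{1 \le j \le k} d_1(p, e_j) = 2\,\bigl(1 - \min_{1 \le j \le k} p_j\bigr) = 2\,(1 - p_{\min}).
```

The maximum is therefore reached at the unit vector belonging to the rarest behaviour.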